Using LSA and Noun Coordination Information to Improve the Precision and Recall of Automatic Hyponymy Extraction

Authors

  • Scott Cederberg
  • Dominic Widdows
Abstract

In this paper we demonstrate methods of improving both the recall and the precision of automatic methods for extraction of hyponymy (IS A) relations from free text. By applying latent semantic analysis (LSA) to filter extracted hyponymy relations we reduce the rate of error of our initial pattern-based hyponymy extraction by 30%, achieving precision of 58%. Applying a graph-based model of noun-noun similarity learned automatically from coordination patterns to previously extracted correct hyponymy relations, we achieve roughly a fivefold increase in the number of correct hyponymy relations extracted.
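
Read together, the two figures in the abstract imply a starting point: if a 30% relative reduction in error yields 58% precision (42% error), the initial error rate was about 60%, i.e. roughly 40% precision, since 0.60 × 0.70 = 0.42. The filtering step itself can be illustrated with a minimal Python sketch. This is not the authors' implementation: the single "such as" pattern, the hand-made two-dimensional vectors standing in for LSA term vectors, and the 0.5 threshold are all assumptions made for illustration, and the graph-based coordination model is not covered.

```python
import math
import re

# Hypothetical pattern "<hypernym> such as <hyponym>"; the abstract does not
# list the patterns actually used.
PATTERN = re.compile(r"(\w+) such as (\w+)")


def extract_pairs(text):
    """Return (hyponym, hypernym) candidates matched by the toy pattern."""
    return [(hypo, hyper) for hyper, hypo in PATTERN.findall(text)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def filter_pairs(pairs, vectors, threshold=0.5):
    """Keep candidates whose two terms have similar vectors.

    Pairs with a term missing from `vectors` are dropped.
    """
    kept = []
    for hypo, hyper in pairs:
        if hypo in vectors and hyper in vectors:
            if cosine(vectors[hypo], vectors[hyper]) >= threshold:
                kept.append((hypo, hyper))
    return kept


# Invented stand-ins for LSA term vectors, chosen so the spurious pair scores low.
TOY_VECTORS = {
    "fruits": [0.9, 0.1],
    "apples": [0.8, 0.2],
    "farmers": [0.2, 0.9],
    "drought": [0.9, 0.1],
}

TEXT = "They grow fruits such as apples. It causes problems for farmers such as drought."

candidates = extract_pairs(TEXT)
filtered = filter_pairs(candidates, TOY_VECTORS)
print(candidates)
print(filtered)
```

In this toy input the pattern attaches "drought" to the nearest noun, "farmers", rather than to "problems"; the similarity filter is what discards that candidate. In a real setting the vectors would come from a dimensionality-reduced term-document or term-term matrix rather than being written by hand.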

Related resources

Japanese Hyponymy Extraction based on a Term Similarity Graph

Semantic relations between words, such as hyponymy, synonymy and meronymy, have various information access applications (e.g. Web search) and the automatic extraction of such relations from corpora is an important research problem in natural language processing. For the Japanese language, there exist several linguistic resources that contain these relations, such as the Japanese Wordnet, Nihong...

Noun-Phrase Analysis in Unrestricted Text for Information Retrieval

Information retrieval is an important application area of natural-language processing where one encounters the genuine challenge of processing large quantities of unrestricted natural-language text. This paper reports on the application of a few simple, yet robust and efficient noun-phrase analysis techniques to create better indexing phrases for information retrieval. In particular, we describe...

Improving Automatic Summarization of Persian Texts Using Natural Language Processing Methods and a Similarity Graph

A significant amount of available information is stored in textual databases, which contain large collections of documents from different sources (such as news, articles, books, emails and web pages). The increasing visibility and importance of this class of information motivates us to work on better automatic evaluation tools for textual resources. The automatic summarization of tex...

A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model

Information extraction (IE) is the process of automatically producing a structured representation from unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP), intensified by the growing volume, heterogeneity, and unstructured form of information. One of the core information extraction tasks is relation extraction wh...

Publication date: 2003